NSF PAR Search | NSF Public Access Repository


Model Lakes. In EDBT 2025.

https://doi.org/10.48786/edbt.2025.81

Pal, Koyena; Bau, David; Miller, Renée J (January 2025, OpenProceedings.org)
EDBT (Ed.)
Given a set of deep learning models, it can be hard to find models appropriate to a task, understand the models, and characterize how models differ from one another. Currently, practitioners rely on manually written documentation to understand and choose models. However, not all models have complete and reliable documentation. As the number of models increases, the challenges of finding, differentiating, and understanding models become increasingly crucial. Inspired by research on data lakes, we introduce the concept of model lakes. We formalize key model lake tasks, including model attribution, versioning, search, and benchmarking, and discuss fundamental research challenges in the management of large models. We also explore what data management techniques can be brought to bear on the study of large model management.
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models

Marks, Samuel; Rager, Can; Michaud, Eric J; Belinkov, Yonatan; Bau, David; Mueller, Aaron (January 2025, Open Review)

Full Text Available
Customizing Text-to-Image Models with a Single Image Pair

https://doi.org/10.1145/3680528.3687642

Jones, Maxwell; Wang, Sheng-Yu; Kumari, Nupur; Bau, David; Zhu, Jun-Yan (December 2024, ACM)

Full Text Available
Measuring and Controlling Instruction (In)Stability in Language Model Dialogs

Li, Kenneth; Liu, Tianle; Bashkansky, Naomi; Bau, David; Viégas, Fernanda; Pfister, Hanspeter; Wattenberg, Martin (October 2024, COLM)

System-prompting is a standard tool for customizing language-model chatbots, enabling them to follow a specific instruction. An implicit assumption in the use of system prompts is that they will be stable, so the chatbot will continue to generate text according to the stipulated instructions for the duration of a conversation. We propose a quantitative benchmark to test this assumption, evaluating instruction stability via self-chats between two instructed chatbots. Testing popular models like LLaMA2-chat-70B and GPT-3.5, we reveal a significant instruction drift within eight rounds of conversations. An empirical and theoretical analysis of this phenomenon suggests the transformer attention mechanism plays a role, due to attention decay over long exchanges. To combat attention decay and instruction drift, we propose a lightweight method called split-softmax, which compares favorably against two strong baselines.
Full Text Available
Function Vectors in Large Language Models

Todd, Eric; Li, Millicent; Sharma, Arnab; Mueller, Aaron; Wallace, Byron C; Bau, David (May 2024, International Conference on Learning Representations)

We report the presence of a simple neural mechanism that represents an input-output function as a vector within autoregressive transformer language models (LMs). Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number of attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV). FVs are robust to changes in context, i.e., they trigger execution of the task on inputs such as zero-shot and natural text settings that do not resemble the ICL contexts from which they are collected. We test FVs across a range of tasks, models, and layers and find strong causal effects across settings in middle layers. We investigate the internal structure of FVs and find that while they often contain information that encodes the output space of the function, this information alone is not sufficient to reconstruct an FV. Finally, we test semantic vector composition in FVs, and find that to some extent they can be summed to create vectors that trigger new complex tasks. Our findings show that compact, causal internal vector representations of function abstractions can be explicitly extracted from LLMs.
Full Text Available
Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs

https://doi.org/10.18653/v1/2024.emnlp-main.543

Feucht, Sheridan; Atkinson, David; Wallace, Byron C; Bau, David (January 2024, Association for Computational Linguistics)

LLMs process text as sequences of tokens that roughly correspond to words, where less common words are represented by multiple tokens. However, individual tokens are often semantically unrelated to the meanings of the words/concepts they comprise. For example, Llama-2-7b’s tokenizer splits the word “patrolling” into two tokens, “pat” and “rolling”, neither of which corresponds to semantically meaningful units like “patrol” or “-ing.” Similarly, the overall meanings of named entities like “Neil Young” and multi-word expressions like “break a leg” cannot be directly inferred from their constituent tokens. Mechanistically, how do LLMs convert such arbitrary groups of tokens into useful higher-level representations? In this work, we find that last token representations of named entities and multi-token words exhibit a pronounced “erasure” effect, where information about previous and current tokens is rapidly forgotten in early layers. Using this observation, we propose a method to “read out” the implicit vocabulary of an autoregressive LLM by examining differences in token representations across layers, and present results of this method for Llama-2-7b and Llama-3-8B. To our knowledge, this is the first attempt to probe the implicit vocabulary of an LLM.
Full Text Available
Content-based Search for Deep Generative Models

https://doi.org/10.1145/3610548.3618189

Lu, Daohan; Wang, Sheng-Yu; Kumari, Nupur; Agarwal, Rohan; Tang, Mia; Bau, David; Zhu, Jun-Yan (December 2023, ACM)
Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task

Li, Kenneth; Hopkins, Aspen K.; Bau, David; Viégas, Fernanda; Pfister, Hanspeter; Wattenberg, Martin (May 2023, ICLR)

Full Text Available
Future Lens: Anticipating Subsequent Tokens from a Single Hidden State

https://doi.org/10.18653/v1/2023.conll-1.37

Pal, Koyena; Sun, Jiuding; Yuan, Andrew; Wallace, Byron; Bau, David (January 2023, Proceedings of the Conference on Computational Natural Language Learning (CoNLL))
